Using Past Speaker Behavior to Better Predict Turn Transitions
Authors
Abstract
Tomer M. Sagie Meshorer, Master of Science. Center for Spoken Language Understanding within the Oregon Health & Science University School of Medicine, June 2017. Thesis Advisor: Prof. Peter A. Heeman.

Conversations are at the core of everyday social interactions. The interactions between conversants are performed within a sophisticated, self-managed turn-taking system. In human conversations, this system supports minimal speaker overlap during turn transitions and minimal gaps between turns. Spoken dialogue systems (SDS) are a new form of conversational user interface that lets users interact with a computer by voice. As such, the turn-taking capabilities of SDS should evolve from a simple timeout to a more human-like model. Recent advances in turn-taking for SDS use local features of the last few utterances to predict turn transitions.

This thesis explores using a summary of past speaker behavior to better predict turn transitions. We believe that the summary features represent an evolving model of the other conversant. For example, speakers who typically use long turns are likely to use long turns in the future. In addition, speakers with more control of the conversation floor are less likely to yield the turn. As the conversational image of a speaker evolves over the course of the conversation, other speakers might adjust their turn-taking behavior in response.

We computed two types of summary features that represent the current speaker's past turn-taking behavior: relative turn length and relative floor control. Relative turn length measures the current turn length so far (in seconds and words) relative to the speaker's average turn length.
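The relative turn length feature described in the abstract could be computed roughly as follows. This is a minimal sketch and not the thesis's implementation: it assumes the feature is a plain ratio of the current turn's length so far to the mean length of the same speaker's earlier turns, and the names `SpeakerHistory` and `relative_turn_length` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SpeakerHistory:
    """Running totals over one speaker's completed turns (hypothetical helper)."""
    total_seconds: float = 0.0
    total_words: int = 0
    turns: int = 0

    def add_turn(self, seconds: float, words: int) -> None:
        """Record a completed turn for this speaker."""
        self.total_seconds += seconds
        self.total_words += words
        self.turns += 1


def relative_turn_length(
    history: SpeakerHistory, seconds_so_far: float, words_so_far: int
) -> Optional[Tuple[float, float]]:
    """Return (relative seconds, relative words) for the turn in progress.

    Assumption: each value is current length / speaker's average length.
    Returns None when there is no usable history yet.
    """
    if history.turns == 0 or history.total_seconds == 0 or history.total_words == 0:
        return None
    avg_seconds = history.total_seconds / history.turns
    avg_words = history.total_words / history.turns
    return seconds_so_far / avg_seconds, words_so_far / avg_words


# Usage: two past turns, then a turn in progress at half the average length.
history = SpeakerHistory()
history.add_turn(seconds=2.0, words=4)
history.add_turn(seconds=4.0, words=8)
feature = relative_turn_length(history, seconds_so_far=1.5, words_so_far=3)
```

Under this assumption, a value below 1 means the speaker has so far said less than in a typical turn of theirs, and a value above 1 means the turn is already longer than usual. How the thesis handles a speaker's first turn is not stated in the abstract; returning `None` here is only one possible choice.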
Similar resources
Timing in conversation: The anticipation of turn endings
We examined how communicators can switch between speaker and listener role with such accurate timing. During conversations, the majority of role transitions happens with a gap or overlap of only a few hundred milliseconds. This suggests that listeners can predict when the turn of the current speaker is going to end. Our hypothesis is that listeners know when a turn ends because they know how it...
Respiratory Turn-Taking Cues
This paper investigates to what extent breathing can be used as a cue to turn-taking behaviour. The paper improves on existing accounts by considering all possible transitions between speaker states (silent, speaking, backchanneling) and by not relying on global speaker models. Instead, all features (including breathing range and resting expiratory level) are estimated in an incremental fashion...
Dutch and English toddlers' use of linguistic cues in predicting upcoming turn transitions
Adults achieve successful coordination during conversation by using prosodic and lexicosyntactic cues to predict upcoming changes in speakership. We examined the relative weight of these linguistic cues in the prediction of upcoming turn structure by toddlers learning Dutch (Experiment 1; N = 21) and British English (Experiment 2; N = 20) and adult control participants (Dutch: N = 16; English: ...
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
ConToVi: Multi-Party Conversation Exploration using Topic-Space Views
We introduce a novel visual analytics approach to analyze speaker behavior patterns in multi-party conversations. We propose Topic-Space Views to track the movement of speakers across the thematic landscape of a conversation. Our tool is designed to assist political science scholars in exploring the dynamics of a conversation over time to generate and prove hypotheses about speaker interactions...